Convex optimization over intersection of simple sets: improved convergence rate guarantees via an exact penalty approach
We consider the problem of minimizing a convex function over the intersection
of finitely many simple sets which are easy to project onto. This is an
important problem arising in various domains such as machine learning. The main
difficulty lies in finding the projection of a point onto the intersection of
many sets. Existing approaches yield an infeasible point with an
iteration-complexity of for nonsmooth problems with no
guarantees on the infeasibility. By reformulating the problem through exact
penalty functions, we derive first-order algorithms which not only guarantee
that the distance to the intersection is small but also improve the complexity
to and for smooth functions. For
composite and smooth problems, this is achieved through a saddle-point
reformulation where the proximal operators required by the primal-dual
algorithms can be computed in closed form. We illustrate the benefits of our
approach on a graph transduction problem and on graph matching.
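
The exact-penalty idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a plain subgradient method on a toy 2-D problem, where the constraint "x lies in every set" is replaced by the penalty term lam * (sum of distances to the sets). The two sets (unit ball and the half-plane x[0] <= 0.5), the objective, the penalty weight `lam`, the step size 1/t, and all function names are assumptions made for this example.

```python
import math

def penalized_subgradient(x, target, lam):
    """Subgradient of 0.5*||x - target||^2 + lam*(dist(x, ball) + dist(x, halfplane))."""
    g = [x[0] - target[0], x[1] - target[1]]  # gradient of the smooth objective
    norm = math.hypot(x[0], x[1])
    if norm > 1.0:
        # Outside the unit ball: the distance function has gradient x / ||x||.
        g[0] += lam * x[0] / norm
        g[1] += lam * x[1] / norm
    if x[0] > 0.5:
        # Outside the half-plane {x[0] <= 0.5}: the distance has gradient (1, 0).
        g[0] += lam
    return g

def infeasibility(x):
    """Sum of the distances from x to the two toy sets."""
    return max(math.hypot(x[0], x[1]) - 1.0, 0.0) + max(x[0] - 0.5, 0.0)

def solve(target=(2.0, 2.0), lam=5.0, iters=200_000):
    """Subgradient descent on the penalized objective with step size 1/t."""
    x = [0.0, 0.0]
    for t in range(1, iters + 1):
        g = penalized_subgradient(x, target, lam)
        step = 1.0 / t  # valid choice here because the objective is 1-strongly convex
        x = [x[0] - step * g[0], x[1] - step * g[1]]
    return x

if __name__ == "__main__":
    x = solve()
    print("approximate minimizer:", x)
    print("infeasibility:", infeasibility(x))
```

For this toy instance the constrained minimizer is (0.5, sqrt(0.75)), and with `lam = 5` the penalized problem has the same minimizer, so the iterates only ever need projections onto the individual sets, never onto their intersection.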
How Many Pairwise Preferences Do We Need to Rank A Graph Consistently?
We consider the problem of optimal recovery of the true ranking of items from
a randomly chosen subset of their pairwise preferences. It is well known that
without any further assumption, one requires a sample size of for
the purpose. We analyze the problem with the additional structure of a relational
graph over the items, together with an assumption of
locality: neighboring items are similar in their rankings. Noting the
preferential nature of the data, we choose to embed not the graph but its
strong product, to capture the pairwise node relationships. Furthermore,
unlike existing literature that uses Laplacian embedding for graph-based
learning problems, we use a richer class of graph
embeddings, orthonormal representations, that includes the (normalized)
Laplacian as a special case. Our proposed algorithm, Pref-Rank,
predicts the underlying ranking using an SVM-based approach over the chosen
embedding of the product graph, and is the first to provide statistical
consistency on two ranking losses, Kendall's tau and Spearman's
footrule, with a required sample complexity of pairs, being the chromatic
number of the complement graph . Clearly, our sample complexity is
smaller for dense graphs, with characterizing the degree of node
connectivity, which is also intuitive due to the locality assumption, e.g.
for union of -cliques, or for random
and power law graphs etc., a quantity much smaller than the fundamental limit
of for large . This, for the first time, relates ranking
complexity to structural properties of the graph. We also report experimental
evaluations on different synthetic and real datasets, where our algorithm is
shown to outperform the state-of-the-art methods. Comment: In Thirty-Third AAAI Conference on Artificial Intelligence, 2019
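
The two ranking losses named in the abstract above have simple textbook definitions, sketched below. This is an illustration only: the paper may use normalized variants, and the function names and the convention that `rank[i]` holds the rank of item `i` are assumptions made for this example.

```python
from itertools import combinations

def kendall_tau_distance(rank_a, rank_b):
    """Number of item pairs that the two rankings order in opposite ways."""
    return sum(
        1
        for i, j in combinations(range(len(rank_a)), 2)
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0
    )

def spearman_footrule(rank_a, rank_b):
    """Sum over items of the absolute difference between their two ranks."""
    return sum(abs(a - b) for a, b in zip(rank_a, rank_b))

if __name__ == "__main__":
    true_rank = [1, 2, 3, 4]
    predicted = [2, 1, 3, 4]  # items 0 and 1 are swapped
    print(kendall_tau_distance(true_rank, predicted))
    print(spearman_footrule(true_rank, predicted))
```

Both losses are zero exactly when the predicted ranking matches the true one, and Kendall's tau counts precisely the pairwise preferences that were predicted wrongly, which is why it fits a setting where the data arrive as pairwise comparisons.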
Second order cone programming approaches for handling missing and uncertain data
We propose a novel second order cone programming formulation for designing robust classifiers
which can handle uncertainty in observations. Similar formulations are also derived for designing
regression functions which are robust to uncertainties in the regression setting. The proposed formulations
are independent of the underlying distribution, requiring only the existence of second order
moments. These formulations are then specialized to the case of missing values in observations
for both classification and regression problems. Experiments show that the proposed formulations
outperform imputation.
Random Separating Hyperplane Theorem and Learning Polytopes
The Separating Hyperplane theorem is a fundamental result in Convex Geometry
with myriad applications. Our first result, Random Separating Hyperplane
Theorem (RSH), is a strengthening of this for polytopes. RSH asserts that if
the distance between and a polytope with vertices and unit diameter
in is at least , where is a fixed constant in ,
then a randomly chosen hyperplane separates and with probability at
least and margin at least .
An immediate consequence of our result is the first near optimal bound on the
error increase in the reduction from a Separation oracle to an Optimization
oracle over a polytope.
RSH has algorithmic applications in learning polytopes. We consider a
fundamental problem, denoted the "Hausdorff problem", of learning a unit
diameter polytope within Hausdorff distance , given an optimization
oracle for . Using RSH, we show that with polynomially many random queries
to the optimization oracle, can be approximated within error .
To our knowledge this is the first provable algorithm for the Hausdorff
Problem. Building on this result, we show that if the vertices of are
well-separated, then an optimization oracle can be used to generate a list of
points, each within Hausdorff distance of , with the property
that the list contains a point close to each vertex of . Further, we show
how to prune this list to generate a (unique) approximation to each vertex of
the polytope. We prove that in many latent variable settings (e.g., topic
modeling, LDA), optimization oracles do exist, provided we project to a suitable
SVD subspace. Thus, our work yields the first efficient algorithm for finding
approximations to the vertices of the latent polytope under the
well-separatedness assumption.
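
The flavor of the RSH statement in the abstract above (a random hyperplane separates a far-away point from a polytope with some probability and margin) can be explored numerically with a small Monte Carlo sketch. It does not reproduce the theorem's constants or the paper's sampling scheme. Drawing the direction uniformly from the unit sphere, the toy square polytope, the margin value, and all function names are assumptions made for this example.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def random_unit_vector(dim, rng):
    """Uniform random direction: normalize a standard Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v))
    return [c / norm for c in v]

def separates(u, point, vertices, margin):
    """True if direction u puts `point` at least `margin` beyond every vertex.

    A linear function over a polytope is maximized at a vertex, so checking
    the vertices is enough to check the whole polytope.
    """
    return dot(u, point) - max(dot(u, v) for v in vertices) >= margin

def separation_frequency(point, vertices, margin, trials=2000, seed=0):
    """Fraction of random directions that separate `point` from the polytope."""
    rng = random.Random(seed)
    dim = len(point)
    hits = sum(
        separates(random_unit_vector(dim, rng), point, vertices, margin)
        for _ in range(trials)
    )
    return hits / trials

if __name__ == "__main__":
    square = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
    print(separation_frequency((2.0, 0.25), square, margin=0.1))
    print(separation_frequency((0.25, 0.25), square, margin=0.0))
```

A point inside the polytope is never separated, while a point at a constant distance is separated by a constant fraction of directions in this low-dimensional toy case. The theorem concerns how that probability and margin behave as the dimension and the number of vertices grow.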